Clock frequency adjustment for semi-conductor devices

ABSTRACT

A method and apparatus are provided for clocking data processing modules, which require differing average clock frequencies, and for transferring data between the modules. This comprises a means for providing a common clock signal to modules. Clock pulses are deleted from the common clock signal to individual modules in dependence on the clocking frequency required by each module. Clock pulses are applied to modules between which data is to be transferred at times consistent with the data transfer.

FIELD OF THE INVENTION

This invention relates to clock frequency adjustment for semi-conductor devices and in particular to adjustment of clock frequencies for semiconductor devices which comprise a plurality of modules which are clocked at different rates, most typically, multiple processing elements provided upon a system on chip (SoC).

BACKGROUND TO THE INVENTION

As semi-conductor devices are becoming smaller and smaller, system on chip devices are being produced with more and more different processing elements integrated on the same chip. These processing elements would previously have been provided as separate semi-conductor devices.

Semi-conductor devices perform their functions in response to clock signals which are provided at one or more inputs to the device and divided and distributed internally to the various processing elements. Where different processing elements form part of the same system and signals are transferred between them, a common clock is normally used. If the various processing elements have differing clock frequency requirements then some consideration needs to be given to how data is transferred between those modules. Data transfer can be kept simple if modules are clocked at integer clock ratios with respect to each other. If arbitrary clock ratios are required, however, more expensive and complex synchronisation is needed, and this can result in an increased latency penalty on data transfers.

A typical example SoC device is shown in FIG. 1. This comprises a plurality of modules, 2, 4, and 6. These may be any combination of central processing units, co-processors, interfaces, arbitration units, or any other circuitry required by the SoC that is driven by a clock.

A master clock signal 8 is provided to phase lock loop (PLL) 10. Module 1 takes the clock signal from the PLL 10 as clock 1 and performs its functions at this clocking rate.

Modules 2 and 3 do not need to run as fast as the frequency of clock 1 which is provided to module 1. Therefore, the clock signal to module 2 (clock 2) passes through a divide by N unit 12 and the clock input to module 3 (clock 3) passes through a divide by M unit 14. In this case, N and M are integer amounts. Clocking module 2 and module 3 at lower rates minimises power consumption by these modules.

It will be appreciated that provided M and N are kept at simple integer ratios and all three clocks are carefully synchronised, the transfer of data between the modules may be kept relatively simple. For example, it can be arranged that module 1 will only update its outputs to and read its inputs from module 2 every N clock cycles. Similarly, module 1 will only transfer data to and read its inputs from module 3 once every M clock cycles.

If other ratios of M and N are used (i.e. non integer values) the transfer of data between modules can become more complex and it can become necessary to insert a first in first out buffer (FIFO) or some special synchronisation logic 16 as shown between modules 2 and 3 in FIG. 1.

Some processing systems will have processing requirements which are dynamically variable. Thus, a circuit such as FIG. 1 could use module 1 to perform a significant amount of processing before module 2 performed any processing at all. If module 2 were clocked at the same rate as module 1 then there would be unnecessary power consumption by module 2 in the first stage of the process. If module 2 is operated at a lower clock frequency then the second part of the process would be performed more slowly than the first part. Dynamic variation of clock speed is desirable to provide optimal processing rates in such situations but difficult to achieve, especially when there is a requirement to transfer data between modules running at different clock frequencies.

SUMMARY OF THE INVENTION

Preferred embodiments of the present invention provide a system in which any module's effective clock rate can be fine tuned. This is achieved by providing synchronised clock signals to each module in an SoC where each clock signal is being run at the same basic rate. In order to reduce the clocking rate for a module, a clock gating cell is provided in the clock input line. This is under the control of a clock deletion control unit which controls the clock gating cell to delete clock pulses which are not required in order to achieve an apparent lower clock rate, or a variable clock rate.

Preferably the clock deletion control unit can be set up to delete any arbitrary number of clock pulses from a master clock signal in a set period to achieve the effective clock frequency required. Furthermore, the clock deletion control unit can be modified to regulate the effective clock frequency as required by algorithms running on the processing elements or modules. This control may be via a register setting, possibly modifiable by software running on the SoC, or alternatively it can be dynamically set to an optimum value via a metric generated within the algorithms running within the module to which the clock pulses are being supplied.

Each module within an SoC may have its own dedicated clock deletion control unit and clock gating cell so that each may be run at different clocking rates. Safe data transfer between modules running at different clocking rates is handled by either forcing a clock pulse on the sending and receiving modules when there is data to be transferred, or making use of an existing two way handshake to deliberately stall the transfer of data until appropriate clock pulses happen to occur on both the sending and receiving modules. A combination of these methods may also be used.

BRIEF DESCRIPTION OF DRAWINGS

An embodiment of the invention will now be described in more detail by way of example with reference to the accompanying drawings in which:

FIG. 1 is the prior art system referred to above;

FIG. 2 shows a clock gating cell and clock deletion control unit in accordance with an embodiment of the invention;

FIG. 3 shows the timing wave forms of a non-stalling (valid only) protocol;

FIG. 4 shows the timing wave form for data transfer between modules with a specific two way handshake (valid-enable protocol);

FIG. 5 shows an embodiment of the invention in which clock pulses are forced onto modules to ensure correct data transfer;

FIG. 6 shows an embodiment of the invention in which a two way handshake protocol is modified to ensure correct data transfer;

FIG. 7 shows one arrangement of the clocking configuration for use with an embodiment of the invention; and,

FIG. 8 shows an optimised clocking configuration for transfer between modules 1 and 2 in FIG. 6.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

In a preferred embodiment of the invention, the divide by N and the divide by M units 12, 14 of FIG. 1 are replaced by the clock deletion unit of FIG. 2. This comprises a clock gating cell 20 which is positioned between the clock and the clock input to a module. This clock gating cell 20 also receives an input from a clock deletion control unit 22. This clock deletion control unit 22 also receives the same clock input as the clock gating cell. In addition, it receives a control input 24 which contains data relating to the required clocking rate which is to be applied to the module in question. In response to the control input 24, the clock deletion control unit 22 generates a series of pulses which are applied to the clock gating cell 20, which in turn causes clock pulses to be generated at the module.

Thus, the clock deletion control unit of FIG. 2 can be set up to delete any arbitrary clock pulses from the master clock signal within a set clock period, and in any arbitrary order to achieve the effective clock frequency required. Furthermore, the control input 24 may be modified at any time to change the effective clock frequency as required by the algorithms running on the module.
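By way of illustration only, the behaviour of the clock gating cell 20 and the clock deletion control unit 22 can be sketched in Python at the level of whole clock periods. The function names and the evenly spread deletion policy are assumptions made for this sketch; the description above leaves the order of deletion arbitrary, and any pattern with the same number of retained pulses gives the same average rate.

```python
def deletion_pattern(pulses_wanted, frame_length):
    """Return a list of 0/1 enables, one per master clock period in a frame.

    Assumed policy: spread the retained pulses as evenly as possible.
    """
    if not 0 <= pulses_wanted <= frame_length:
        raise ValueError("pulses_wanted must lie between 0 and frame_length")
    pattern = []
    accumulator = 0
    for _ in range(frame_length):
        accumulator += pulses_wanted
        if accumulator >= frame_length:
            accumulator -= frame_length
            pattern.append(1)   # let this master clock pulse through
        else:
            pattern.append(0)   # delete this master clock pulse
    return pattern


def gate_clock(master_pulses, enables):
    """Model of the clock gating cell: a pulse passes only when enabled."""
    return [m & e for m, e in zip(master_pulses, enables)]


def effective_frequency(master_hz, enables):
    """Average clock frequency seen by the module over one frame."""
    return master_hz * sum(enables) / len(enables)
```

For example, `deletion_pattern(3, 8)` keeps three of every eight master clock pulses, so a module fed from a hypothetical 200 MHz master clock with `deletion_pattern(5, 16)` would see an average of 62.5 MHz.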

In a preferred embodiment, each of the divide by N and divide by M units 12, 14 in FIG. 1 will be replaced by a clock deletion unit of the type shown in FIG. 2. Indeed, it may be preferable to provide a clock deletion unit for each of the 3 modules in FIG. 1, thereby ensuring that each can be controlled at a varying rate, whereby any module which is not required to perform processes at a particular time may have its clocking pulses removed by a clock deletion unit.

When an SoC is in operation, it will from time to time be necessary to transfer data between two or more modules. When this is necessary, it must be ensured that the modules are clocked at the appropriate times. This can be achieved in a number of ways including the following:

1. by forcing a clock pulse on both modules at the appropriate times when there is valid data to transfer, or,
2. by making use of existing two-way handshake wires to naturally control the data flow from one module to the other where the transfer is recognised at both sides. The use of the handshake signal is subverted to allow data to flow only on the occasions when there happens to be a coincident clock pulse on both modules.

The two-way handshake protocol used here for illustration is named “valid-enable”, which recognises data transfer from one module to the next on the same clock.

It is also possible to accommodate two-way handshake protocols which recognise transfer on different clocks. These require different specific logic designed around the interface protocol for the data transfer.

We will now describe examples of the two methods of transferring data between modules which may be used in embodiments of the present invention. In particular, we show methods for transferring data between modules which are clocked at different effective rates. Both techniques may be used on the same SoC between any number of modules running at any number of effective clock rates. The best choice for the method selected will depend on whether or not a one or two way hand shake protocol is available. It may also depend on the expected characteristics of the data transfer by the selected interface.

The first example is a non-stalling (known here as ‘valid-only’) protocol which can be used at an appropriate module interface. The signals used in the data transfer are shown in FIG. 3. As can be seen, there is a clock signal shown on the top line. The second line represents a handshake wire ‘valid’ which, when high, indicates that the ‘Data’ wires have a value to be transferred. Valid data to be transferred is shown in FIG. 3 as D1, D2, D3 and D4.

In the case where both modules are driven by the same clock, the ‘valid-only’ protocol works without problem. If however the modules are driven at different rates with different clock control settings there is potential for valid data to get lost, or for single words of valid data to get mis-interpreted as multiple words. To avoid this happening we take the valid signal and combine it with the clock gating signals from each of the clock control units to force a clock pulse on each module whenever there is valid data to be transferred.
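The failure modes mentioned above can be illustrated with a small Python sketch. It assumes a simplified model that is not taken from the figures: the sender presents one new word per sender clock pulse and holds its outputs between pulses, and the receiver captures the data wires on each of its own clock pulses while 'valid' is high. The function name is hypothetical.

```python
def valid_only_transfer(words, send_enables, recv_enables):
    """Simulate a valid-only link where each module has its own gated clock.

    send_enables / recv_enables: 0/1 per master clock period, i.e. whether
    the sending / receiving module receives a clock pulse in that period.
    """
    pending = list(words)
    valid, data = 0, None          # sender output registers
    received = []
    for send_en, recv_en in zip(send_enables, recv_enables):
        # Receiver samples the values driven before this clock edge.
        if recv_en and valid:
            received.append(data)
        # Sender updates its outputs on its own clock pulse.
        if send_en:
            if pending:
                valid, data = 1, pending.pop(0)
            else:
                valid = 0
    return received
```

With both modules clocked on every period the four words arrive intact. If only the receiver has alternate pulses deleted, words are lost; if only the sender has alternate pulses deleted, each word is captured twice.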

A specific embodiment is illustrated in FIG. 5. This is a system in which a coincident clock pulse is forced on both sending and receiving modules when data is ready to be transferred. FIG. 5 shows two modules which can be clocked at different effective clock rates by respective clock deleter circuits. The two modules are module 1 and module 2. Module 1 has a clock deletion control unit 42 which receives clock control 1. Module 2 has a clock deletion control unit 44 which receives clock control 2 at its control input.

Each of the clock deletion control units 42 and 44 provides control signals to its respective clock gating cell 46 via a respective OR gate 48. The same clock signal 50 is provided to each clock deletion control unit 42 and 44 and to the two clock gating cells 46.

Module 1 has to transfer data to module 2. When it is ready to transfer that data it produces a valid signal 52 which is applied to module 2 and which is also applied to the second input of each of the OR gates 48. The effect of this is to cause the output of each OR gate 48 to be enabled irrespective of the outputs of the clock deletion control units.

Thus, the OR gates 48 provide enable signals to their respective clock gating cells 46 in response to the valid signal 52 or the outputs of the respective clock deletion control units 42 and 44, causing the clock signal 50 to pass through the respective clock gating cell 46 when the output of the respective OR gate 48 is enabled.
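The arrangement of FIG. 5 can be sketched, purely as an illustration, with the Python model below. It assumes that module 1 works through a list of steps, one per clock pulse it receives, where each step is either a word to send or `None`; this sender model and the function name are assumptions of the sketch rather than features of the embodiment.

```python
def forced_clock_transfer(sender_program, del1, del2):
    """Valid-only link with the valid signal ORed into both clock enables.

    sender_program: what module 1 does on each of its clock pulses; a word
    to send, or None for a cycle with nothing to send (an assumed model).
    del1, del2: per-period outputs of clock deletion control units 42, 44.
    Returns (received words, clock pulses seen by module 1, by module 2).
    """
    program = list(sender_program)
    valid, data = 0, None
    received, pulses1, pulses2 = [], 0, 0
    for d1, d2 in zip(del1, del2):
        en1 = d1 | valid           # OR gate 48 feeding module 1's gating cell 46
        en2 = d2 | valid           # OR gate 48 feeding module 2's gating cell 46
        pulses1 += en1
        pulses2 += en2
        if en2 and valid:          # module 2 captures the word on its pulse
            received.append(data)
        if en1:                    # module 1 moves to its next program step
            step = program.pop(0) if program else None
            valid, data = (1, step) if step is not None else (0, None)
    return received, pulses1, pulses2
```

Because the valid signal forces a pulse on both modules in the period after it is asserted, every word is captured exactly once, at the cost of each module receiving more pulses than its clock deletion control unit alone would have allowed.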

Sophisticated implementations would account for the additional pulses by deleting extra pulses later so the aggregate clock count matches the required rate over a period of time.
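One possible way of accounting for the additional pulses is sketched below in Python. It assumes a simple counter of forced pulses that have not yet been repaid; the counter, the function name and the policy of repaying at the first opportunity are illustrative assumptions only.

```python
def compensated_enables(deletion_out, valid):
    """Return clock enables that honour forced pulses but repay them later.

    deletion_out: nominal per-period enables from the clock deletion unit.
    valid: per-period valid signal that forces a pulse when high.
    A counter ('debt') tracks forced pulses that have not yet been repaid.
    """
    debt = 0
    enables = []
    for nominal, v in zip(deletion_out, valid):
        if v:
            if not nominal:
                debt += 1          # extra pulse forced by the transfer
            enables.append(1)
        elif nominal and debt > 0:
            debt -= 1              # repay: delete a pulse that was scheduled
            enables.append(0)
        else:
            enables.append(nominal)
    return enables, debt
```

In this sketch two forced pulses early in a sixteen period window are offset by deleting the next two scheduled pulses, so the total number of pulses in the window matches the nominal count.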

When a handshake mechanism is present which allows the receiving module to stall data transfer an alternative mechanism is employed to ensure correct data transfer when the modules are clocked at different effective rates. The example protocol used for illustration is known here as a valid-enable transfer which is a two way handshake protocol. The protocol is illustrated in FIG. 4. The first line shows the clock signal. The second line shows the ‘valid’ signal which originates from the sending module and indicates that the value on the data wires is of interest and to be sent to the receiving module. The third line shows the ‘enable’ signal which originates from the receiving module and indicates that the module is ready to accept data. When both the valid and enable signals are high, data is transferred from the sending module to the receiving module, shown in FIG. 4 as D1, D2, D3 and D4. Without special treatment, this protocol would also suffer from incorrect data transfer if the sending and receiving modules were clocked by different effective clock rates. To avoid this we make use of the handshake signals themselves to ensure that we only attempt to transfer data when there are appropriate clocks on both modules.

A specific embodiment of this alternative arrangement for transferring data between modules in a system on chip is shown with reference to FIG. 6. This type of arrangement uses a two way hand shake between modules whereby one processing element can stall back another processing element which wishes to make a data transfer. The effect of this is to ensure that data transfer is only possible when there happens to be coincident clock pulses applied to both sending and receiving modules.

In this arrangement, module 1 has an enable input which is asserted in response to the output of an AND gate 60. Module 2 correspondingly has a valid input which is asserted by the output of an AND gate 62. The enable input to module 1 permits it to send data to module 2 and the valid input of module 2 permits it to receive data from module 1.

A first input of the AND gate 60 is an enable signal produced by module 2 when it is in a state in which it is ready to receive data from module 1. A first input of the AND gate 62 is a valid output from module 1 which is produced when it is able to send data to module 2. The respective second inputs of the AND gates 60 and 62 are provided by a clocking circuit 64.

The clocking circuit 64 has a clock input 66. This clocking signal is sent to two clock gating circuits of the type described with reference to FIG. 2. Module 1 has a clock gating unit comprising a clock deletion control unit 42 receiving a clock control signal 1 at its clock control input. The output of this and the clock signal 66 are provided to its clock gating cell 48 which provides a clock signal to module 1.

Correspondingly, for module 2, a clock deletion control unit 44 receives a clock control signal at its control input and provides an output to its clock gating cell 48 which in turn provides a clocking signal to module 2.

The outputs of the two clock deletion control units 42 and 44 are also provided to a further AND gate 68. The output of this forms the second input to the two AND gates 60 and 62. Thus, when the two clock control signals cause the respective clock deletion control units 42 and 44 to provide enabling pulses to their respective clock gating cells 48, the output of AND gate 68 is asserted, thereby permitting data to pass from module 1 to module 2 if module 1 produces a valid signal on its valid output line and module 2 produces an enable signal on its enable line, i.e. when module 1 is ready to send data and module 2 is also ready to receive data. When this happens, data is sent from module 1 to module 2 in response to the clock signals provided at their respective clock inputs by the respective clock gating cells.
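The gating of the handshake in FIG. 6 can be illustrated with the following Python sketch. It assumes that module 1 asserts valid whenever it has a word queued and that module 2 is ready on every period unless a `ready` list says otherwise; these behaviours and the function name are assumptions of the sketch.

```python
def gated_handshake_transfer(words, del1, del2, ready=None):
    """Valid-enable link whose handshake is gated by coincident clock enables.

    del1, del2: per-period outputs of clock deletion control units 42, 44.
    ready: optional per-period enable output of module 2 (default: always 1).
    Returns (received words, clocked periods in which module 1 was stalled).
    """
    queue = list(words)
    received = []
    stalled_pulses = 0
    for t, (d1, d2) in enumerate(zip(del1, del2)):
        coincident = d1 & d2                        # AND gate 68
        valid_out = 1 if queue else 0               # module 1 has a word to send
        enable_out = 1 if ready is None else ready[t]
        enable_to_1 = enable_out & coincident       # AND gate 60
        valid_to_2 = valid_out & coincident         # AND gate 62
        sender_fires = bool(d1 and valid_out and enable_to_1)
        receiver_fires = bool(d2 and valid_to_2 and enable_out)
        assert sender_fires == receiver_fires       # both sides agree by construction
        if sender_fires:
            received.append(queue.pop(0))
        elif d1 and valid_out:
            stalled_pulses += 1                     # clocked, but had to hold its word
    return received, stalled_pulses
```

The stall count in this model shows why the degree of coincidence between the two deletion patterns matters: patterns that never coincide transfer nothing, however many pulses each module receives.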

This arrangement works most effectively when the clock deletion circuits have a maximum number of coincident clock pulses between them, thereby minimising the chance of one module being unnecessarily stalled whilst waiting for data for transfer to or from the other. A dotted line is shown between the two clock deletion control units in FIG. 6 and this represents linkage between the two clock control inputs to provide some synchronization and thereby ensure maximum number of coincident clock pulses.

In these embodiments of the invention, clock control signals may be hard wired to a constant if no control of the clock rate is required. Alternatively, they may be wired to a register so that the clock rate may be controlled by software running on a processor. Alternatively the control signals may be dynamically adjusted by the module whose clock is being controlled, or indeed by any other module responsible for controlling the clocking rates of other modules.

In the dynamic control case, a metric may be used to provide an indication of whether or not the module being clocked is operating correctly in meeting its real time requirements or whether some adjustment to the clocking speed is required. This metric could be generated, for example, by using the fullness of an appropriate FIFO (first in first out buffer) or other hardware that could be constructed to provide an indication of how much the module is over or under performing. This metric can then be fed back to drive directly the clock control signals via suitable scaling and offsetting.
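As a hedged illustration of such feedback, the Python sketch below converts the fullness of a FIFO into a number of clock pulses to retain per frame. It assumes the FIFO sits at the input of the module being clocked, so that a fuller FIFO indicates the module is falling behind; the `gain` and `offset` parameters stand in for the scaling and offsetting mentioned above and are hypothetical names.

```python
def pulses_from_fifo_fullness(fill_level, fifo_depth, frame_length,
                              gain=1.0, offset=0.0):
    """Map FIFO fullness to a number of clock pulses to keep per frame.

    Assumption: a fuller input FIFO means the module is under-performing
    and should receive more clock pulses in the next frame.
    """
    if fifo_depth <= 0 or frame_length <= 0:
        raise ValueError("fifo_depth and frame_length must be positive")
    fullness = fill_level / fifo_depth            # 0.0 (empty) to 1.0 (full)
    wanted = gain * fullness * frame_length + offset
    # Clamp to what the frame can physically deliver.
    return max(0, min(frame_length, round(wanted)))
```

The result could then be used as the control input of a clock deletion control unit; a real design would also need to consider how quickly the setting is allowed to change.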

Preferably, each of the plurality of modules is clocked with the minimum possible number of clock pulses in any given period of time. It is generally possible to calculate or deduce the minimum clock frequency each module needs to be clocked at to operate its task. The clocking may be controlled within a time period to have periods of inaction and periods of higher frequency clocking if the flow of data within the system within which it is operating dictates that this is required.

It is preferable to maximise the number of concurrent clock pulses between modules so that data is more likely to be transferred between modules as and when it becomes available, rather than having to wait and possibly slow down the system. For example the system may have 3 modules that are driven by clock 1, clock 2 and clock 3 or driven from a common master clock. It may have been deduced that the new clock rates required for clock 1, clock 2 and clock 3 are 4, 8 and 3 pulses respectively for every 16 clock periods. A possible configuration for this is shown in FIG. 7 with the clock pulses required for clock 1, clock 2 and clock 3. In this example, all the clock pulses occur at the beginning of a 16 period cycle for a respective minimum number of clock cycles to maximise the number of concurrent clock pulses.
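The effect of placing all pulses at the beginning of the cycle can be checked with a short Python sketch. The helper names are hypothetical; the pulse counts of 4, 8 and 3 per 16 periods are those of the example above.

```python
def front_loaded(pulses, frame_length=16):
    """Enables with all retained pulses at the start of the frame."""
    return [1] * pulses + [0] * (frame_length - pulses)


def coincident_pulses(a, b):
    """Number of periods in which both modules receive a clock pulse."""
    return sum(x & y for x, y in zip(a, b))


clock1 = front_loaded(4)
clock2 = front_loaded(8)
clock3 = front_loaded(3)
```

With this placement every pair of clocks coincides on as many pulses as the slower of the two has, which is the most that any placement could achieve.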

It is also desirable to take account of the expected rates at which modules produce or receive data and modify the clocking pattern appropriately. For example, if module 1 delivers data to module 2 on average one word every 2 clock pulses, and module 2 needs 4 clock pulses to deal with each word it receives, the arrangement of clock pulses shown in FIG. 7 will not be appropriate. For this situation, a FIFO buffer between the modules would be required to maintain smooth data flow. Alternatively, this requirement can be eliminated by arranging the waveforms differently as shown in FIG. 8. In this, the clock pulses applied to clock 1 are spread to enable module 2 to process the data as it receives it from module 1.

To produce waveforms with characteristics such as shown in FIG. 8, the clock deletion control unit will need to be configured with a frame length corresponding to the number of clock periods before a particular cycle restarts. It would also need to know the number of active cycles, i.e. the number of clock periods within a frame for which a clock pulse is generated, and the number of clock periods between output pulses. This would then enable its clock pulse to be altered to ensure optimum flow of data between modules.
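A minimal Python sketch of a pattern generator driven by these three parameters is given below. The parameter names are assumptions, and `spacing` is taken here to mean the number of clock periods from the start of one pulse to the start of the next, which is one possible reading of the number of clock periods between output pulses. The sketch is not intended to reproduce the exact waveforms of FIG. 8.

```python
def framed_pattern(frame_length, active_cycles, spacing):
    """One frame of clock enables.

    frame_length: clock periods before the pattern restarts.
    active_cycles: clock periods in the frame that carry a pulse.
    spacing: clock periods from the start of one pulse to the next
             (1 gives back-to-back pulses at the start of the frame).
    """
    if frame_length < 1 or spacing < 1 or active_cycles < 0:
        raise ValueError("invalid frame parameters")
    if active_cycles and (active_cycles - 1) * spacing >= frame_length:
        raise ValueError("pulses do not fit in the frame")
    pattern = [0] * frame_length
    for k in range(active_cycles):
        pattern[k * spacing] = 1
    return pattern
```

For instance, `framed_pattern(16, 4, 4)` spreads four pulses evenly across a sixteen period frame, whereas `framed_pattern(16, 8, 1)` places eight pulses back to back at the start of the frame.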

1. A method for clocking data processing modules which require differing average clock frequencies and for transferring data between the modules comprising the steps of providing a common clock signal to all modules, deleting clock pulses provided from the common clock signal to individual modules in dependence on the clocking frequency required by each module, and applying clock pulses to modules between which data is to be transferred at times consistent with the data transfer to be performed.
 2. A method according to claim 1 in which clock pulses are applied to modules at times consistent with data sourcing and sinking capabilities of the modules between which data is to be transferred.
 3. A method according to claim 1 in which clock pulses are applied to modules substantially synchronously to perform data transfer.
 4. A method according to claim 2 in which clock pulses are applied to modules at differing times to perform data transfer.
 5. A method according to claim 1 in which clock pulses are applied to modules between which data is to be transferred in response to interface protocol signals related to the data transfer.
 6. A method according to claim 5 including the step of adjusting the number of clock pulses subsequently applied to a module in dependence on the number of clock pulses applied in response to the interface protocol signals related to the data transfer.
 7. A method according to claim 1 in which the step of applying clock pulses comprises providing a signal from a first module indicating that it is ready to transfer data, and applying the clock pulses to the first module and to a second module for receiving the data in response to the signal from the first module.
 8. A method according to claim 7 in which the step of applying the clock pulses to the first and second modules comprises controlling the deleting step such that clock pulses are provided synchronously to the first and second modules.
 9. A method according to claim 8 in which the step of applying the clock pulses comprises inhibiting the deleting step such that all clock pulses pass to the first and second modules while the deleting step is inhibited.
 10. A method according to claim 1 comprising the steps of performing a handshake operation between modules between which data is to be transferred and wherein the handshake operation is gated such that data transfer is inhibited when appropriate patterns of clock pulses are not available.
 11. A method according to claim 10 in which the handshake operation includes providing a first signal from a first module indicating that it is ready to transfer data and a second signal from a second module indicating that it is able to receive data, and passing the first signal to the second module and the second signal to the first module in the handshake operation in response to appropriate clock signals being applied to the first and second modules.
 12. A method according to claim 11 in which the clock signals are applied to the first and second modules in response to coincident clock gating signals being applied to respective clock gating circuits for the first and second modules.
 13. A method according to claim 12 in which the clock gating circuit performs the deleting step.
 14. A method according to claim 1 including the step of dynamically altering the clock frequency required by each module.
 15. A method according to claim 1 including the step of monitoring a metric to determine whether any adjustment to clocking frequency of a module is required.
 16. Apparatus for clocking data processing modules which require differing average clock frequencies and for transferring data between the modules comprising means for providing a common clock signal to all modules, means for deleting clock pulses provided from the common clock signal to individual modules in dependence on the clocking frequency required by each module, and means for enabling clock pulses to be applied to modules between which data is to be transferred at times consistent with the data transfer to be performed.
 17. Apparatus according to claim 16 in which clock pulses are applied to modules at times consistent with data sourcing and sinking capabilities of the modules between which the data is to be transferred.
 18. Apparatus according to claim 16 in which the means for enabling clock pulses to be applied to modules enables the clock pulses to be applied substantially synchronously to perform data transfer.
 19. Apparatus according to claim 17 in which the means for enabling clock pulses to be applied to modules enables this to happen at differing times.
 20. Apparatus according to claim 16 in which the means for enabling clock pulses to be applied to modules does so in response to interface protocol signals related to the data transfer.
 21. Apparatus according to claim 20 including means for adjusting the number of clock pulses subsequently applied to a module in dependence on the number of clock pulses applied in response to the interface protocol signals related to the data transfer.
 22. Apparatus according to claim 16 in which the means for enabling clock pulses to be applied to modules comprises means for providing a signal for the first module indicating that it is ready to transfer data, and means for applying the clock pulses to the first module and to a second module for receiving the data in response to the signal from the module.
 23. Apparatus according to claim 22 in which the means for applying the clock pulses to the first and second modules comprises means for controlling the deleting steps such that clock pulses are provided sequentially to the first and second modules.
 24. Apparatus according to claim 23 in which the means for applying the clock pulses comprises means for inhibiting the deleting means such that all clock pulses pass to the first and second modules while the deleting means are inhibited.
 25. Apparatus according to claim 16 comprising means for performing a handshake operation between modules between which data is to be transferred and a means for gating the handshake operation such that data transfer is inhibited when appropriate patterns of clock pulses are not available.
 26. Apparatus according to claim 25 in which the means for performing a handshake operation includes the means for providing a first signal from a first module indicating that it is ready to transfer data and a second signal from a second module indicating that it is able to receive data and means for passing the first signal to the second module and the second signal to the first module in the handshake operation in response to appropriate clock signals being applied to the first and second modules.
 27. Apparatus according to claim 26 in which the means for applying clock signals to the first and second modules does so in response to coincident gating control signals being applied to a respective gating circuit for the first and second modules.
 28. Apparatus according to claim 27 in which the deleting means comprises the clock gating circuit.
 29. Apparatus according to claim 16 including means for dynamically altering the clocking frequency required by each module.
 30. Apparatus according to claim 16 including means for monitoring a metric to determine whether any adjustment to clocking frequency of a module is required.
 31. A method for clocking data processing modules which require differing average clock frequencies and for transferring data between the modules comprising the steps of providing a common clock signal to all modules, deleting clock pulses provided from the common clock signal to individual modules in dependence on the clocking frequency required by each module, performing a handshake operation between modules between which data is to be transferred and controlling the handshake operation such that data transfer only occurs in the presence of appropriate patterns of clock pulses applied to the two modules.
 32. Apparatus for clocking data processing modules which require differing average clock frequencies and for transferring data between the modules comprising means for providing a common clock signal to all modules, means for deleting clock pulses provided from the common clock signal to individual modules in dependence on the clock frequency required by each module, means for performing a handshake operation between modules, between which data is to be transferred, and means for controlling the handshake operation such that data transfer only occurs in the presence of appropriate patterns of clock pulses supplied to the two modules.